Score prediction




Iterative Self-Improvement of Vision Language Models for Image Scoring and Self-Explanation

Tanji, Naoto, Yamasaki, Toshihiko

arXiv.org Artificial Intelligence

ABSTRACT Image scoring is a crucial task in numerous real-world applications. To trust a model's judgment, understanding its rationale is essential. This paper proposes a novel training method for Vision Language Models (VLMs) to generate not only image scores but also corresponding justifications in natural language. Leveraging only an image scoring dataset and an instruction-tuned VLM, our method enables self-training, utilizing the VLM's generated text without relying on external data or models. In addition, we introduce a simple method for creating a dataset designed to improve alignment between predicted scores and their textual justifications. By iteratively training the model with Direct Preference Optimization on two distinct datasets and merging the resulting models, we can improve both scoring accuracy and the coherence of generated explanations. Index Terms: Vision language model, Explainable AI, Image scoring, Self-training, Direct Preference Optimization. 1. INTRODUCTION Deep learning is revolutionizing image analysis, enabling automated classification and scoring with enhanced accuracy and efficiency. Examples include disease detection in medical images, defect identification in quality control, and predicting advertising effectiveness.
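A minimal sketch of how the score/explanation alignment data described above might be constructed. The "Score:" marker format, the tolerance, and the pairing rule are illustrative assumptions, not the paper's exact recipe: generations whose parsed score is close to the ground truth are preferred over the rest.

```python
import re

def build_preference_pairs(samples, true_score, tol=0.5):
    """Split VLM generations into chosen/rejected pairs for DPO.

    `samples` is a list of generated texts, each expected to contain a
    line like "Score: 7.5" followed by a justification. A generation is
    "chosen" when its parsed score is within `tol` of the ground-truth
    score, and "rejected" otherwise.
    """
    chosen, rejected = [], []
    for text in samples:
        m = re.search(r"Score:\s*([0-9]+(?:\.[0-9]+)?)", text)
        if m and abs(float(m.group(1)) - true_score) <= tol:
            chosen.append(text)
        else:
            rejected.append(text)
    # Pair every accurate generation with every inaccurate one.
    return [(c, r) for c in chosen for r in rejected]

pairs = build_preference_pairs(
    ["Score: 7.5\nSharp focus and good lighting.",
     "Score: 3.0\nBlurry subject."],
    true_score=7.0)
```

Pairs built this way can feed a standard DPO objective, which is then iterated as the abstract describes.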


SC4ANM: Identifying Optimal Section Combinations for Automated Novelty Prediction in Academic Papers

Wu, Wenqing, Zhang, Chengzhi, Bao, Tong, Zhao, Yi

arXiv.org Artificial Intelligence

Novelty is a core component of academic papers, and there are multiple perspectives on the assessment of novelty. Existing methods often focus on word or entity combinations, which provide limited insights. The content related to a paper's novelty is typically distributed across different core sections, e.g., Introduction, Methodology and Results. Therefore, exploring the optimal combination of sections for evaluating the novelty of a paper is important for advancing automated novelty assessment. In this paper, we utilize different combinations of sections from academic papers as inputs to drive language models to predict novelty scores. We then analyze the results to determine the optimal section combinations for novelty score prediction. We first employ natural language processing techniques to identify the sectional structure of academic papers, categorizing them into introduction, methods, results, and discussion (IMRaD). Subsequently, we use different combinations of these sections (e.g., introduction and methods) as inputs for pretrained language models (PLMs) and large language models (LLMs), employing novelty scores provided by human expert reviewers as ground truth labels to obtain prediction results. The results indicate that using the introduction, results, and discussion sections is most appropriate for assessing the novelty of a paper, while the use of the entire text does not yield significant results. Furthermore, based on the results of the PLMs and LLMs, the introduction and results appear to be the most important sections for the task of novelty score prediction. The code and dataset for this paper can be accessed at https://github.com/njust-winchy/SC4ANM.
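The experimental grid implied above can be sketched simply: enumerate every non-empty IMRaD subset and concatenate the chosen sections into one model input. The function names and the join format are illustrative, not taken from the SC4ANM code.

```python
from itertools import combinations

SECTIONS = ["introduction", "methods", "results", "discussion"]

def section_combinations():
    """Enumerate all non-empty IMRaD section combinations to compare
    as model inputs for novelty score prediction."""
    combos = []
    for r in range(1, len(SECTIONS) + 1):
        combos.extend(combinations(SECTIONS, r))
    return combos  # 2^4 - 1 = 15 combinations

def build_input(paper, combo):
    """Concatenate the chosen sections of a paper (a dict mapping
    section name to text) into a single model input."""
    return "\n\n".join(paper[s] for s in combo)
```

Each combination's input is scored against the expert labels, and the best-performing subset (here, introduction + results + discussion) is reported.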


Are Large Language Models State-of-the-art Quality Estimators for Machine Translation of User-generated Content?

Qian, Shenbin, Orăsan, Constantin, Kanojia, Diptesh, Carmo, Félix do

arXiv.org Artificial Intelligence

This paper investigates whether large language models (LLMs) are state-of-the-art quality estimators for machine translation of user-generated content (UGC) that contains emotional expressions, without the use of reference translations. To achieve this, we employ an existing emotion-related dataset with human-annotated errors and calculate quality evaluation scores based on the Multidimensional Quality Metrics (MQM) framework. We compare the accuracy of several LLMs with that of our fine-tuned baseline models, under in-context learning and parameter-efficient fine-tuning (PEFT) scenarios. We find that PEFT of LLMs leads to better performance in score prediction, with human-interpretable explanations, than fine-tuned models. However, a manual analysis of LLM outputs reveals that they still have problems such as refusing to reply to a prompt and producing unstable output while evaluating machine translation of UGC.
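The MQM-based scoring mentioned above amounts to a weighted error penalty normalised by segment length. The severity weights and the normalisation below are common illustrative defaults, not necessarily the exact scheme used in the paper.

```python
def mqm_score(errors, word_count, weights=None):
    """Compute a reference-free MQM-style quality score from annotated
    errors. `errors` maps severity ("minor"/"major"/"critical") to a
    count. The weights and the per-word normalisation are illustrative
    defaults; the paper's exact scoring scheme may differ.
    """
    weights = weights or {"minor": 1, "major": 5, "critical": 10}
    penalty = sum(weights[sev] * n for sev, n in errors.items())
    # Normalise per word so segments of different length are comparable,
    # and clamp at zero for heavily penalised segments.
    return max(0.0, 1.0 - penalty / word_count)

score = mqm_score({"minor": 2, "major": 1}, word_count=70)  # ≈ 0.9
```

Scores computed this way from the human error annotations serve as the regression targets that the LLMs and baselines try to predict.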


Exploring Kolmogorov-Arnold networks for realistic image sharpness assessment

Yu, Shaode, Chen, Ze, Yang, Zhimu, Gu, Jiacheng, Feng, Bizu

arXiv.org Artificial Intelligence

Score prediction is crucial in realistic image sharpness assessment after informative features are collected. Recently, Kolmogorov-Arnold networks (KANs) have been developed and have witnessed remarkable success in data fitting. This study presents a Taylor-series-based KAN (TaylorKAN). Different KANs are then explored on four realistic image databases (BID2011, CID2013, CLIVE, and KonIQ-10k) for score prediction, using 15 mid-level features and 2048 high-level features. With support vector regression as the baseline, experimental results indicate that KANs are generally competitive or better; TaylorKAN is the best on three databases when using mid-level features, while KANs are inferior on CLIVE when high-level features are used. This is the first study that explores KANs for image quality assessment. It sheds light on how to select and improve KANs on related tasks.
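The core idea of a Taylor-series KAN can be sketched in a few lines: each edge of the network carries a learnable univariate function expressed as a truncated Taylor expansion, and each output sums the edge functions of its inputs. This is a sketch of the idea only; TaylorKAN's exact parameterisation may differ.

```python
def taylor_edge(x, coeffs, x0=0.0):
    """Evaluate a learnable univariate edge function as a truncated
    Taylor series around x0: sum_k c_k * (x - x0)**k. In a KAN, one
    such function sits on every edge; the coefficients are trained."""
    return sum(c * (x - x0) ** k for k, c in enumerate(coeffs))

def kan_layer(inputs, coeff_table, x0=0.0):
    """One KAN layer: each output sums edge functions of every input.
    coeff_table[j][i] holds the Taylor coefficients for the edge from
    input i to output j."""
    return [sum(taylor_edge(x, coeff_table[j][i], x0)
                for i, x in enumerate(inputs))
            for j in range(len(coeff_table))]

# f(x) = 1 + 2x on each of two edges feeding one output:
y = kan_layer([1.0, 2.0], [[[1.0, 2.0], [1.0, 2.0]]])  # [8.0]
```

The Taylor basis replaces the B-spline basis of the original KAN formulation; stacking such layers and fitting the coefficients by gradient descent yields the score regressor.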


Reading ability detection using eye-tracking data with LSTM-based few-shot learning

Li, Nanxi, Wang, Hongjiang, Zhan, Zehui

arXiv.org Artificial Intelligence

Previous works demonstrated that eye-tracking data supply meaningful information for reading ability detection and have achieved promising results with machine learning methods [1-18]. Eye-tracking-based methods of reading ability detection fall into two main categories: one estimates reading ability over a finite number of classes [1-14], providing a qualitative evaluation of subjects' reading ability; the other predicts reading ability scores with regression models [15-18], providing a quantitative evaluation. Although the former exhibits satisfactory accuracy in detecting certain classes of reading abnormalities, it lacks the capability to predict exact reading-ability scores, which is emphasized in highly interactive educational environments (such as online learning) for making personalized and intelligent responses to subjects. However, precise score prediction of reading ability from eye-tracking data is not easy [15-18], especially when the sample data of subjects are few. In this paper, using a few-shot learning strategy, a regression model for score prediction is proposed that combines Long Short-Term Memory (LSTM) [19] and lightweight neural networks. The proposed model exhibits higher accuracy than previous score prediction models tested on the same dataset.
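The LSTM-plus-lightweight-network regressor described above can be sketched at its smallest scale: a single-unit LSTM consumes the eye-tracking sequence and a linear head maps the final hidden state to a score. All parameter names here are illustrative scalars; the paper's architecture is larger and its features are multidimensional.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def lstm_score(sequence, p):
    """Minimal single-unit LSTM followed by a linear head. `p` holds
    scalar weights (illustrative, not the paper's actual parameters):
    each gate g has input weight p['w_g'], recurrent weight p['u_g'],
    and bias p['b_g'].
    """
    h = c = 0.0
    for x in sequence:  # e.g. one fixation-derived feature per time step
        i = sigmoid(p["w_i"] * x + p["u_i"] * h + p["b_i"])   # input gate
        f = sigmoid(p["w_f"] * x + p["u_f"] * h + p["b_f"])   # forget gate
        o = sigmoid(p["w_o"] * x + p["u_o"] * h + p["b_o"])   # output gate
        g = math.tanh(p["w_g"] * x + p["u_g"] * h + p["b_g"])  # candidate
        c = f * c + i * g
        h = o * math.tanh(c)
    # Linear head maps the final hidden state to a reading-ability score.
    return p["w_out"] * h + p["b_out"]
```

In a few-shot setting, the small parameter count of such a head is what keeps the regression trainable from limited subject data.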


Evaluating Research Quality with Large Language Models: An Analysis of ChatGPT's Effectiveness with Different Settings and Inputs

Thelwall, Mike

arXiv.org Artificial Intelligence

Evaluating the quality of academic journal articles is a time-consuming but critical task for national research evaluation exercises, appointments, and promotions. It is therefore important to investigate whether Large Language Models (LLMs) can play a role in this process. This article assesses which ChatGPT inputs (full text without tables, figures and references; title and abstract; title only) produce better quality score estimates, and the extent to which scores are affected by ChatGPT models and system prompts. The results show that the optimal input is the article title and abstract, with average ChatGPT scores based on these (30 iterations on a dataset of 51 papers) correlating at 0.67 with human scores, the highest ever reported. ChatGPT 4o is slightly better than 3.5-turbo (0.66) and 4o-mini (0.66). The results suggest that article full texts might confuse LLM research quality evaluations, even though complex system instructions for the task are more effective than simple ones. Thus, whilst abstracts contain insufficient information for a thorough assessment of rigour, they may contain strong pointers about originality and significance. Finally, linear regression can be used to convert the model scores into the human scale scores, which is 31% more accurate than guessing.
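The calibration step in the final sentence is ordinary least squares on a single predictor: fit human scores against model scores, then apply the fitted line to new model scores. A minimal sketch with illustrative data:

```python
def fit_line(model_scores, human_scores):
    """Ordinary least squares fit of human scores on model scores,
    returning (slope, intercept) so that new model scores can be
    mapped onto the human scale."""
    n = len(model_scores)
    mx = sum(model_scores) / n
    my = sum(human_scores) / n
    sxx = sum((x - mx) ** 2 for x in model_scores)
    sxy = sum((x - mx) * (y - my)
              for x, y in zip(model_scores, human_scores))
    slope = sxy / sxx
    return slope, my - slope * mx

# Illustrative data, not the paper's: map model scores onto human scores.
slope, intercept = fit_line([2.0, 4.0, 6.0], [1.0, 2.0, 3.0])
calibrated = slope * 5.0 + intercept  # a new model score of 5.0 -> 2.5
```

Averaging over 30 iterations before calibrating, as the article does, reduces the variance of the model-score predictor.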


Scoreformer: A Surrogate Model For Large-Scale Prediction of Docking Scores

Ciudad, Álvaro, Morales-Pastor, Adrián, Malo, Laura, Filella-Mercè, Isaac, Guallar, Victor, Molina, Alexis

arXiv.org Artificial Intelligence

In this study, we present ScoreFormer, a novel graph transformer model designed to accurately predict molecular docking scores, thereby optimizing high-throughput virtual screening (HTVS) in drug discovery. The architecture integrates Principal Neighborhood Aggregation (PNA) and Learnable Random Walk Positional Encodings (LRWPE), enhancing the model's ability to understand complex molecular structures and their relationship with their respective docking scores. This approach significantly surpasses traditional HTVS methods and recent Graph Neural Network (GNN) models in both recovery and efficiency due to a wider coverage of the chemical space and enhanced performance. Our results demonstrate that ScoreFormer achieves competitive performance in docking score prediction and offers a substantial 1.65-fold reduction in inference time compared to existing models. We evaluated ScoreFormer across multiple datasets under various conditions, confirming its robustness and reliability in identifying potential drug candidates rapidly.
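The Principal Neighborhood Aggregation component mentioned above combines several neighbour aggregators, each modulated by degree-dependent scalers. The sketch below follows the published PNA idea with scalar features; the exact aggregators and scalers ScoreFormer uses are not specified in the abstract.

```python
import math

def pna_aggregate(neighbor_feats, degree, avg_log_degree=1.0):
    """Principal Neighborhood Aggregation (sketch): combine four
    aggregators (mean, max, min, std) of a node's neighbour features
    with three degree scalers (identity, amplification, attenuation),
    yielding a 12-dimensional node update input."""
    n = len(neighbor_feats)
    mean = sum(neighbor_feats) / n
    var = sum((x - mean) ** 2 for x in neighbor_feats) / n
    aggs = [mean, max(neighbor_feats), min(neighbor_feats), math.sqrt(var)]
    # Log-degree scaler, normalised by the training-set average.
    s = math.log(degree + 1) / avg_log_degree
    scalers = [1.0, s, 1.0 / s if s > 0 else 1.0]
    # Cartesian product of aggregators and scalers -> 4 * 3 = 12 features.
    return [a * k for a in aggs for k in scalers]

feats = pna_aggregate([1.0, 3.0], degree=2)
```

Using multiple aggregators lets the network distinguish neighbourhoods that a single mean or max would conflate, which matters for molecular graphs with similar local statistics.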


Connected Speech-Based Cognitive Assessment in Chinese and English

Luz, Saturnino, Garcia, Sofia De La Fuente, Haider, Fasih, Fromm, Davida, MacWhinney, Brian, Lanzi, Alyssa, Chang, Ya-Ning, Chou, Chia-Ju, Liu, Yi-Chien

arXiv.org Artificial Intelligence

We present a novel benchmark dataset and prediction tasks for investigating approaches to assess cognitive function through analysis of connected speech. The dataset consists of speech samples and clinical information for speakers of Mandarin Chinese and English with different levels of cognitive impairment as well as individuals with normal cognition. These data have been carefully matched by age and sex through propensity score analysis to ensure balance and representativeness in model training. The prediction tasks encompass mild cognitive impairment diagnosis and cognitive test score prediction. This framework was designed to encourage the development of approaches to speech-based cognitive assessment which generalise across languages. We illustrate it by presenting baseline prediction models that employ language-agnostic and comparable features for diagnosis and cognitive test score prediction. The models achieved an unweighted average recall of 59.2% in diagnosis and a root mean squared error of 2.89 in score prediction.
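The two metrics reported above are standard and easy to state precisely: unweighted average recall (the mean of per-class recalls, robust to class imbalance between impaired and control speakers) for diagnosis, and root mean squared error for test-score regression.

```python
import math

def unweighted_average_recall(y_true, y_pred):
    """Mean of per-class recalls: each class contributes equally
    regardless of how many examples it has."""
    classes = sorted(set(y_true))
    recalls = []
    for c in classes:
        idx = [i for i, t in enumerate(y_true) if t == c]
        hits = sum(1 for i in idx if y_pred[i] == c)
        recalls.append(hits / len(idx))
    return sum(recalls) / len(classes)

def rmse(y_true, y_pred):
    """Root mean squared error for cognitive test score prediction."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
                     / len(y_true))
```

For example, an RMSE of 2.89 means predicted test scores deviate from the true scores by about 2.89 points on the test's scale, on a root-mean-square basis.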